# Infodemic pathways: Evaluating the role that traditional and social media play in cross-national information transfer replication file

This replication file contains the files and data necessary for a partial replication of "Infodemic pathways", published in Frontiers in Political Science. The replication is only partial because Twitter's terms of service prohibit sharing user-specific information or any tweet fields other than status_id. If you are interested in additional data or scripts, please contact the corresponding author at aengus.bridgman@mail.mcgill.ca.

## Data

The data directory contains:
* fig2.csv: the data for Figure 2. Variables are:
  * random_id: a randomly generated unique id for each individual. These are generated for this file only.
  * United States: the number of follow relationships the individual has to U.S.-based Twitter users
  * Canada: the number of follow relationships the individual has to Canada-based Twitter users
  * ratio: the ratio between U.S. and 
  * Color: a variable that 
* fig3.csv: the data for Figure 3
* misinfo_sds.csv: the data for the regression computing the relationship between tweets containing misinformation and number of U.S. follows. Variables are:
  * random_id: a randomly generated unique id for each individual. These are generated for this file only and do not match those from fig2.csv
  * total_tweets: the number of tweets the individual produced during the period examined. Note that this excludes retweets.
  * total_misinfo: total number of tweets identified as containing or pertaining to misinformation
  * misinfo_percentage: the share of the individual's tweets identified as misinformation, computed as total_misinfo/total_tweets
  * misinfo_sds: the misinfo_percentage as expressed in standard deviations away from the overall mean
  * total_follows: the number of follow relationships the individual has where a location could be identified
  * us_follows: the number of follow relationships the individual has for those identified as based in the United States
  * us_percentage: the percentage of all geolocated follows that can be identified as based in the United States
  * us_sds: the us_percentage as expressed in standard deviations away from the overall mean
* supp_fig1.csv: the data for Supplementary Figure 1
* survey.dta: the survey data file
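
The derived columns in misinfo_sds.csv follow from the raw counts, and the relationship can be illustrated with a minimal sketch. This is not the authors' code (the analysis itself is in infodemic.R). The rows below are made-up values, and the sketch assumes that us_percentage is stored as a proportion and that the standard deviation is the sample standard deviation; neither is confirmed by this file.

```python
from statistics import mean, stdev

# Made-up example rows using the column names from misinfo_sds.csv.
rows = [
    {"total_tweets": 10, "total_misinfo": 1, "total_follows": 100, "us_follows": 20},
    {"total_tweets": 20, "total_misinfo": 0, "total_follows": 300, "us_follows": 150},
    {"total_tweets": 40, "total_misinfo": 10, "total_follows": 40, "us_follows": 30},
    {"total_tweets": 5, "total_misinfo": 1, "total_follows": 50, "us_follows": 5},
]


def standardize(values):
    """Express each value in standard deviations away from the overall mean.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    centre = mean(values)
    spread = stdev(values)
    return [(value - centre) / spread for value in values]


# Per-individual shares, as described in the variable list.
for row in rows:
    row["misinfo_percentage"] = row["total_misinfo"] / row["total_tweets"]
    # Assumption: stored as a proportion; multiply by 100 for a percentage.
    row["us_percentage"] = row["us_follows"] / row["total_follows"]

# Standardized versions of the two shares.
misinfo_sds = standardize([row["misinfo_percentage"] for row in rows])
us_sds = standardize([row["us_percentage"] for row in rows])

for row, m_sd, u_sd in zip(rows, misinfo_sds, us_sds):
    row["misinfo_sds"] = m_sd
    row["us_sds"] = u_sd
    print(row)
```

Because standardizing removes the scale, the us_sds values are the same whether us_percentage is stored as a proportion or on a 0 to 100 scale.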

## Analysis

The analyses for the paper were done in Stata and R.

* infodemic.R contains the code analyzing the social media data and producing Figures 2 and 3 in the main body and Figure 1 in the supplementary materials
* infodemic.do contains the code analyzing the survey data and producing Tables 1 and 2 and Figure 4 in the main text and Tables 2, 3, and 4 in the supplementary materials